PREFACE


The  work  described  in  this  report  was  authorized  under  Project  No.  1N6A, 
This  work  was  started  in  June  1991  and  completed  in  September  1992. 

In  conducting  the  research  described  in  this  report,  the  investigators  adhered  to 
the  "Guide  for  the  Care  and  Use  of  Laboratory  Animals,"  National  Institute  of  Health 
Publication  No.  86-23,  1985,  as  promulgated  by  the  committee  on  Revision  of  the  Guide  for 
Laboratory  Animal  Facilities  and  Care  of  the  Institute  of  Laboratory  Animal  Resources, 
Commission  of  Life  Sciences,  National  Research  Council  (Washington,  DC).  These 
investigations  were  also  performed  in  accordance  with  the  requirements  of  AR  70-18, 
Laboratory  Animals,  Procurement,  Transportation,  Use,  Care,  and  Public  Affairs. 

The  use  of  trade  names  or  manufacturers’  names  in  this  report  does  not 
constitute  an  official  endorsement  of  any  commercial  products.  This  report  may  not  be  cited 
for  purposes  of  advertisement. 

This  report  has  been  approved  for  release  to  the  public.  Registered  users 
should  request  additional  copies  from  the  Defense  Technical  Information  Center;  unregistered 
users  should  direct  such  requests  to  the  National  Technical  Information  Service. 

QUALITY ASSURANCE


This study, governed by Protocol Number 210910430000, was
examined for compliance with Good Laboratory Practices as
published by the U.S. Environmental Protection Agency in 40 CFR
Part 792 (effective 18 September 1989). The dates of all
inspections and the dates the results of those inspections were
reported to the Study Director and management were as follows:

Date Reported to Study Director/Management
Final Report
18 July 1994


05  Nov  1991 
15  Sept  1994 


To  the  best  of  my  knowledge,  the  methods  described  in  this 
report  were  the  methods  followed  during  the  study  as  indicated  by 
the  raw  data  found  in  the  laboratory  notebook.  The  report  was 
determined  to  be  an  accurate  reflection  of  the  raw  data  recorded. 


Kenneth P. Cameron
Quality Specialist
Life Sciences Department


HYPERACTIVATED RABBIT SPERM CELL MOTILITY PARAMETERS


1.  INTRODUCTION 

A recent study showed that hyperactivated motility of rabbit sperm cells was
suppressed by metals implicated in fertility disturbances and not by others devoid of this
property.* Estimates of the decrease in hyperactivated motility were based on subjective
visual observations, and although the decrease was found in several replicates, an objective
method to measure hyperactivated motility is desirable. Hyperactivated motility is necessary
for fertilization. To understand the mechanism underlying this phenomenon,^ to apply it to
assessing fertility effects associated with sperm cells' exposure to chemicals,* or to use it for
fertility prediction in clinical settings^’^ by measuring the decline in hyperactivation, great
care must be taken in developing objective, accurate, and dependable rules for classifying
hyperactivated and non-hyperactivated sperm. Statistical analytical methods, based on the
motion parameters determined by motion analytical systems, were brought to bear on the
problem of identifying either the hyperactivated or non-hyperactivated state of individual
sperm cells in a mixed population. Detailed statistical analyses were used to investigate and
understand the relationship between the components of flagellar motion, their
interrelationship, and their relationship to hyperactivity. Cell state was modeled as a
function of motion parameter values, and model effectiveness was assessed in terms of
misclassification error. The results of the investigation are presented in this report.


2.  MATERIALS  AND  METHODS 

2.1  Analysis  of  Sperm  Cells. 

Videotapes of the motion of rabbit sperm cells that did not develop
hyperactivated motility after incubation for 1 or 2 hr, and those that developed hyperactivated
motility after 16-20 hr incubations*’* were used for analyses. Analysis with the CellSoft
system and methods for developing hyperactivated motility were carried out as previously
described. The settings for the CellTrak system (Motion Analysis Corporation, Santa
Rosa, CA) were frame rate, 30 frames/s; duration of frame capture, 30 frames; minimum
path length, 15 frames; minimum burst speed, 20 µm/s; maximum burst speed, 500 µm/s;
distance scale factor, 1.839 µm/pixel; camera aspect ratio, 1.0; amplitude of lateral head
(ALH) path smoothing factor, 7 frames; centroid X and Y search neighborhood, 4 and 2
pixels, respectively; centroid cell size minimum and maximum, 2 and 25 pixels, respectively;
maximum path interpolation, 1 frame; path prediction percentage, 0%.

Hyperactivated sperm cells were identified using criteria previously defined.*’®
When necessary, close visual inspection of the videotape was carried out frame by frame to
ensure correct classification of the motility type.

The motion parameters measured were curvilinear velocity (VC), straight line
velocity (VST), linearity (LIN), maximum amplitude of lateral head (MALH) displacement,
average amplitude of lateral head (AALH) displacement, beat cross frequency (BCF),
straightness (STR), wobble (WOB), AALH/LIN, and VC*AALH.

2.2  Statistical  Procedures. 

The  statistical  analysis  was  completed  in  the  following  four  stages: 

•  Univariate examination of each motility characteristic between the classes
of hyperactivated and non-hyperactivated motility was based on sample means, standard
deviations, relative frequency distributions, and boxplots. An indication of variable
importance was obtained by using the p-values for the Mann-Whitney test.

•  The joint contribution of variables to classification was explored graphically
using scatter plots provided by NCSS version 5.1, 1987, and BMDP 1983, program 6D.

•  Classification was pursued using standard discriminant analysis and newer
tree-structured methods with available software. Stepwise discriminant analysis,
complemented by binary regression, was performed using BMDP statistical software (BMDP
1983, programs 7M, 1R, and 9R).

•  The Classification And Regression Trees (CART™, Version 1.1, California
Statistical Software, Inc., Belmont, CA) and A Fast Algorithm for Classification Trees
(FACT, Version 1.1, Software Development and Distribution Center, MACC, University of
Wisconsin, Madison, WI) software were used to establish a decision tree for classification.
CART™ was principally used, with the FACT results serving to corroborate. Final results
for misclassification errors were computed using cross validation.
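
As an illustration of the kind of search a tree-structured method performs when it chooses a split on a single variable, the sketch below scans candidate thresholds for one motion parameter and keeps the threshold with the fewest misclassified cells. It is not the CART™ or FACT implementation: the function name `best_split`, the use of a raw error count as the splitting criterion, and the data are assumptions made for illustration only.

```python
def best_split(values, labels):
    """Return (threshold, errors) for the rule 'value < threshold => hyperactivated'.

    values: measurements of one motion parameter (hypothetical data).
    labels: 1 for hyperactivated, 0 for non-hyperactivated.
    Candidate thresholds are midpoints between adjacent distinct values.
    Returns None when fewer than two distinct values are available.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for t in candidates:
        # A cell is misclassified when the rule and its true class disagree.
        errors = sum(1 for v, y in zip(values, labels) if (v < t) != (y == 1))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Hypothetical WOB-like values: low values for hyperactivated cells.
example_values = [0.30, 0.42, 0.55, 0.88, 0.93, 0.95]
example_labels = [1, 1, 1, 0, 0, 0]
example_split = best_split(example_values, example_labels)
```

The rule direction is fixed here (small values indicate hyperactivation); a parameter such as VC, where large values indicate hyperactivation, would need the comparison reversed.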


3.  RESULTS  AND  DISCUSSION 

3.1  Motion Parameter Statistics.

Summary statistics for each of the 10 motion parameters for 322 hyperactivated
and 899 non-hyperactivated sperm cells are given in Table 1. More detailed
information is given in histograms appearing in the Appendix. The sample mean and
standard deviation give an indication of where the center portion of the data lies, and the
extreme points bound the values observed. Some unusual values are found in the table. The
maximum for VC, MALH, and AALH is more than 5 standard deviations from the mean,
and for AALH/LIN and VC*AALH, the maximum is more than 18 and 9 standard
deviations, respectively. Hyperactivated cells generally show smaller values for VST, LIN,
BCF, STR, and WOB (Table 1). For all motion parameters but LIN, the standard deviation
differs between classes; in particular, note MALH, AALH, AALH/LIN, and VC*AALH.

Table 1.  Summary Statistics for the Motility Parameters Determined
          for both Hyperactivated and Non-hyperactivated Cells

Motility    Sample
Parameter   Class       Mean ± SD          Minimum   Maximum   Range

VC          hyper       137.6 ± 52.0       51.0      344.8     293.8
            non-hyper   83.1 ± 35.7        23.2      191.3     168.1
VST         hyper       30.4 ± 21.1        0.1       104.4     104.3
            non-hyper   71.0 ± 35.5        0.7       175.5     174.8
LIN         hyper       0.24 ± 0.18        0.01      0.74      0.73
            non-hyper   0.85 ± 0.18        0.02      0.99      0.97
MALH        hyper       9.9 ± 4.9          0.2       27.8      27.6
            non-hyper   2.3 ± 1.3          0.6       10.9      10.3
AALH        hyper       7.1 ± 3.3          1.1       20.1      19.0
            non-hyper   1.6 ± 0.9          0.4       10.9      10.5
BCF         hyper       12.2 ± 5.5         1.2       25.9      24.7
            non-hyper   15.0 ± 3.8         1.2       27.2      26.0
STR         hyper       0.58 ± 0.30        0.02      0.98      0.96
            non-hyper   0.90 ± 0.14        0.04      0.99      0.95
WOB         hyper       0.40 ± 0.15        0.01      0.79      0.78
            non-hyper   0.94 ± 0.08        0.23      1.00      0.77
AALH/LIN    hyper       92.5 ± 192.2       2.2       2008.0    2005.8
            non-hyper   2.4 ± 3.7          0.4       53.5      53.1
VC*AALH     hyper       1124.3 ± 984.3     96.4      6459.7    6363.3
            non-hyper   144.9 ± 128.3      16.4      722.9     706.5

Summary statistics for each motility parameter are reported individually
for each cell state. The mean ± the sample standard deviation,
for the population, gives information as to the location of the
majority of motility parameter values. The minimum, maximum,
and range provide information as to the extremes. All summary
statistics were computed using BMDP statistical software.

Data in Table 1 suggest that the differences in means and standard deviations would be
statistically significant. This was confirmed by the nonparametric Mann-Whitney test for
location and the Squared Ranks test for variances (p < 0.01). The caveat to this is that the
large sample sizes (322 hyperactivated cells and 899 non-hyperactivated cells) cause
the power of the test to be quite high, even for relatively small differences between the
hypothesized value of the parameter in question and its alternative.
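
The effect of sample size on significance can be illustrated with a rough calculation. The sketch below uses a large-sample two-sample z statistic with unequal variances, which is a stand-in and not the Mann-Whitney test used in this study; the function name `two_sample_z` and the reduced sample sizes are assumptions for illustration. The BCF means and standard deviations are taken from Table 1.

```python
import math


def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Large-sample z statistic for a difference in means (unequal variances)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# BCF from Table 1: non-hyper 15.0 +/- 3.8 (n=899), hyper 12.2 +/- 5.5 (n=322).
z_full = two_sample_z(15.0, 3.8, 899, 12.2, 5.5, 322)

# Same means and standard deviations with hypothetical samples one tenth the size.
z_small = two_sample_z(15.0, 3.8, 90, 12.2, 5.5, 32)
```

With the full samples the statistic is roughly 8.4, while the same difference with the smaller hypothetical samples gives roughly 2.7, so statistical significance alone says little about how well a parameter separates the classes.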

3.2  Single  Classification. 

A difference in distribution location between motility classes for a motility
parameter only hints that the parameter might be useful in classification. The extent to
which  the  distributions  overlap  must  be  examined,  because  it  is  within  the  intervals  where 
overlap  occurs  that  the  potential  exists  for  misclassification.  Figure  1  illustrates  the  overlap 
for  each  of  the  motility  parameters  using  stacked  boxplots  of  the  hyperactivated  (H)  and  non- 
hyperactivated  (N)  class  distributions.  The  basic  form  of  the  boxplot,  consisting  of  the 
quartiles  and  the  minimum  and  maximum  values,  was  used.  For  this  figure,  all  motility 
parameter  values  were  standardized,  using  the  combined  data  mean  and  standard  deviation 
for  the  scaling.  This  permitted  the  simultaneous  viewing  of  the  distributions  for  each 
parameter  and  comparison  of  all  regarding  their  potential  for  use  as  classifiers.  The 
numerical  value  is  given  for  standardized  values  beyond  four  standard  deviations  from  the 
mean.  Figure  1  shows  that  for  the  parameters  LIN  and  WOB,  at  most,  25%  of  the  non- 
hyperactivated  cells  show  values  that  are  similar  to  those  of  the  hyperactivated  class.  It  is 
likely  that  AALH,  MALH,  and  VC*AALH  will  also  be  reasonable  classifiers,  based  on  the 
degree  of  separation  of  motility  classes  seen  between  the  boxes  representing  the  middle  50% 
of  the  data.  The  BCF  provides  an  example  of  a  parameter  with  limited  classifying  potential. 
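
The standardization and boxplot summary described above can be sketched as follows. This is a minimal illustration, not the software used for Figure 1: the function names are hypothetical, the sample standard deviation is assumed for scaling, and the quartile convention is whatever Python's `statistics.quantiles` applies by default, which may differ from the one used in the report.

```python
import statistics


def standardize(values_h, values_n):
    """Z-score both classes using the mean and SD of the combined data."""
    combined = values_h + values_n
    mu = statistics.mean(combined)
    sd = statistics.stdev(combined)  # sample SD; an assumption
    z_h = [(v - mu) / sd for v in values_h]
    z_n = [(v - mu) / sd for v in values_n]
    return z_h, z_n


def five_number_summary(z):
    """Minimum, first quartile, median, third quartile, maximum."""
    q1, median, q3 = statistics.quantiles(z, n=4)
    return min(z), q1, median, q3, max(z)


# Hypothetical WOB-like measurements for the two classes.
z_hyper, z_non = standardize([0.2, 0.3, 0.4, 0.5], [0.8, 0.9, 0.95, 1.0])
box_hyper = five_number_summary(z_hyper)
box_non = five_number_summary(z_non)
```

Comparing where the two boxes (first to third quartile) fall on the common z-score axis gives the degree of overlap that the text uses to judge classifying potential.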

The  relative  frequency  distributions  for  LIN,  VC,  AALH,  and  WOB  are 
compared  between  the  two  motility  classes  in  Tables  2  and  3.  Linearity,  VC,  and  AALH 
were  selected  because  of  their  prominence  in  the  literature,^"*  and  WOB  was  selected  for  its 
importance  in  this  study.  Hyperactivated  cells  were  absent  in  the  0.8  -  1.0  interval  for  both 
LIN  and  WOB  (Table  2),  and  conversely  high  percentages  of  non-hyperactivated  cells,  LIN, 
75.5%  and  WOB,  94.3%  were  found  within  this  interval.  This  strongly  suggests  good 
classifying  potential  for  each.  The  AALH  shows  only  minimal  distribution  overlap,  and  VC 
has  somewhat  more.  The  individual  concomitants  of  hyperactivation  suggested  by  Mortimer 
and  Mortimer  for  human  sperm**  are  consistent  with  these  results  despite  the  fact  that  rabbit 
sperm  values  are  reported  here. 
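
A relative frequency distribution of the kind compared here can be computed as in the sketch below. The function name and the interval convention (lower bound included, upper bound excluded except in the last interval) are assumptions; the report does not state how boundary values were assigned.

```python
def relative_frequencies(values, edges):
    """Percent of values falling in each interval [edges[i], edges[i+1]).

    The last interval also includes its upper bound. Values outside the
    range of edges are not counted in any interval.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [100.0 * c / len(values) for c in counts]


# Ten intervals of width 0.1, as used for LIN and WOB.
unit_edges = [i / 10 for i in range(11)]

# Hypothetical WOB values for a handful of cells.
example_percents = relative_frequencies([0.25, 0.31, 0.38, 0.44, 0.95], unit_edges)
```

Applying the function separately to the hyperactivated and non-hyperactivated cells gives two columns of percentages that can be compared interval by interval.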

3.3  Multiple  Classification. 

The scatterplots in Figure 2 show the relationship between the paired values of
the four motility parameters discussed above and each motility class. Each possible pairing
for VC, LIN, and AALH is represented, as well as the pairing for VC and WOB. The
symbol, h, indicates the presence of one or more hyperactivated cells with values of the two
motility parameters defining its position; c denotes non-hyperactivated cells, and an asterisk
designates both classes of cells. Using the partitions suggested by Mortimer and Mortimer,"
the first three scatterplots are divided into quadrants, with the symbol, H, denoting the
quadrant where the values of both motility parameters were consistent with hyperactivated
motility. For the fourth plot of VC and WOB, partitioning was achieved in a manner to be
addressed later. It is apparent that in each scatterplot, the quadrant designated for
hyperactivated cells contains few non-hyperactivated cells. The least pure is the partition
formed on VC and LIN. For all but the VC and WOB plot, hyperactivated cells were also
plentiful in other quadrants, suggesting that classification rules based on these partitions
would be adequate to identify a cell as hyperactivated. However, classification rules based on
these partitions would not be adequate for correctly classifying all cells in a mixed population
of hyperactivated and non-hyperactivated cells. Analogous arguments for higher dimensions
can be made.


[Figure 1: stacked boxplots of the hyperactivated (H) and non-hyperactivated (N)
distributions for each motility parameter; horizontal axis, Z-score from -4 to 4.]

Figure 1.  Boxplots of the Standardized Motility Parameter
           Distributions. The graphical summary allows a
           quick comparison of the motility parameters in
           their ability to separate on the basis of hyperactivation.
           The box is formed from the first and
           third quartiles, with the median indicated as a
           vertical line within the box. The extremes are
           connected to the box with a line segment.


Table 2.  Relative Frequency Distributions (Given in Percents)
          of WOB and LIN for both Hyperactivated (n=322)
          and Non-hyperactivated (n=899) Cells

              LIN                             WOB
Interval    Hyper   Non-hyper     Interval    Hyper   Non-hyper

0.0 - 0.1   25.8    0.8           0.0 - 0.1   1.2     0.0
0.1 - 0.2   24.2    1.1           0.1 - 0.2   5.9     0.0
0.2 - 0.3   19.3    0.8           0.2 - 0.3   23.0    0.4
0.3 - 0.4   12.4    1.1           0.3 - 0.4   27.7    0.0
0.4 - 0.5   9.3     2.2           0.4 - 0.5   18.3    0.3
0.5 - 0.6   4.0     3.8           0.5 - 0.6   11.8    0.4
0.6 - 0.7   3.4     5.9           0.6 - 0.7   7.1     1.3
0.7 - 0.8   1.6     8.8           0.7 - 0.8   5.0     3.3
0.8 - 0.9   0.0     22.2          0.8 - 0.9   0.0     7.1
0.9 - 1.0   0.0     53.3          0.9 - 1.0   0.0     87.2


Table 3.  Relative Frequency Distributions (Given in Percents)
          of VC and AALH for both Hyperactivated (n=322)
          and Non-hyperactivated (n=899) Cells

              VC
Interval    Hyper   Non-hyper

0-20        0.0     0.0
20-40       0.0     7.6
40-60       4.0     25.3
60-80       6.2     21.6
80-100      11.8    16.6
100-120     19.0    10.2
120-140     18.6    9.8
140-160     14.9    5.6

16.8
0.0

14-16       1.5     0.0
16-18       1.9     0.0
18-20       0.3     0.0
20-         0.3     0.0

The frequency distributions shown provide a refined
description of the pattern of variability for each of the
motility parameters shown. All frequency distributions
were constructed using BMDP statistical software.


Figure 2.  Scatterplots of Motion Parameters with Class Identifiers for Hyperactivated
           (h) and Non-hyperactivated (c). The scatterplots produced using BMDP
           show the degree of class separation attainable with motility parameter pairs.
           The symbol, c, originally represented circular or linear behavior. It was
           retained in this figure because it visually contrasts well with the symbol, h.


Several  workers  have  advocated  the  use  of  VC,  LIN,  AALH,  or  MALH  in 
combination  for  classifying  hyperactivated  sperm  cells.^  '*’"  These  rules  were  based  solely  on 
the  subjective  extension  of  single  motility  parameters  as  classifiers.  We  have  taken  a 
comprehensive  and  objective  analytical  approach  by  applying  regression  and  discriminant 
analysis  to  the  problem  of  using  multiple  motility  parameters  to  classify  sperm  cell  motility. 
Discriminant  analysis  can  be  used  to  separate  classes  based  on  a  linear  compound  of  the 
motility  parameters.  This  compound  is  simply  a  one-dimensional  index  that  can  be  used  to 
classify  the  observations  into  groups.  In  this  simple  two-group  environment,  discriminant 
analysis is analogous to performing a regression analysis on a binary (0,1) class variable and
then  assigning  an  observation  to  class  one  if  the  predicted  value  is  0.5  or  greater  and  to  class 
zero  otherwise.  The  BMDP  statistical  software  supporting  both  discriminant  analysis  and 
regression  was  used  to  model  the  relationship  among  motility  parameters  and  class 
assignment,  with  more  emphasis  being  given  the  regression  approach. 
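
The two-group analogy described above, regressing a binary class variable on a predictor and cutting the predicted value at 0.5, can be sketched with a single predictor. This is an illustration under assumptions: the function names and data are hypothetical, only one variable is used, and it is not the BMDP routine.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def classify(x_new, intercept, slope, cutoff=0.5):
    """Assign class one if the predicted value is 0.5 or greater, else class zero."""
    return 1 if intercept + slope * x_new >= cutoff else 0


# Hypothetical WOB values with class codes 0 and 1.
wob = [0.30, 0.40, 0.50, 0.90, 0.95, 1.00]
cls = [0, 0, 0, 1, 1, 1]
intercept, slope = fit_line(wob, cls)
```

The fitted value is used only as a score for class assignment, not as a probability, which is the sense in which the regression approach is applied in this report.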

The  rationale  for  using  both  regression  and  discriminant  analysis  routines  to 
support  the  derivation  of  motility  classification  rules  was  to  offset  a  failure  to  meet  the 
assumptions  of  the  formal  discriminant  analysis  and  to  make  use  of  greater  flexibility  in  the 
regression  routines.  Discriminant  analysis  assumes  that  the  variables  used  to  classify  groups 
come  from  multivariate  normal  distributions,  which  differ  only  in  location.  This  assumption 
is  violated  by  the  apparent  nonnormality  of  many  of  the  motility  parameters  (Figure  1  and 
appendix).  The  common  covariance  matrix  assumption  is  also  doubtful  (Table  1).  Without 
these  assumptions,  the  computed  probability  of  class  membership  for  each  cell  is  invalid. 
However,  successful  applications  are  possible  when  assumptions  are  violated  (see  Reference 
12)  by  using  the  discriminant  index  as  a  measure  of  separation  between  classes,  devaluing  its 
use  in  forming  a  probability  of  class  membership.  The  advantage  afforded  by  regression  is 
that  the  regression  routines  are  more  convenient  for  conducting  variable  selection  and 
checking  model  adequacy.  In  regression,  the  predicted  value  for  each  cell  is  used  as  a 
relative  score  for  class  assignment,  relying  only  on  the  assumptions  usually  made  for  a  least- 
squares  fit. 

Table 4 summarizes the results of a subset of best models established using
stepwise discriminant analysis and all possible subsets regression. The models are labeled
discriminant 1 (D1) - discriminant 24 (D24). Three models using the motility parameters
recommended^’**’” are given as D25-D27. Models, for a fixed number of motility
characteristics, are listed in the order of decreasing R². The models were evaluated in terms
of their efficiency, which is defined as the ability to correctly classify both hyperactivated
and non-hyperactivated cells. In this analysis, classifying a hyperactivated cell as
non-hyperactivated was as grievous an error as classifying a non-hyperactivated cell as
hyperactivated. The percentage of correctly classified cells (hyperactivated and
non-hyperactivated) was used to define efficiency. In computing this percentage, resubstitution
error, random cross validation, and the test sample method were used. Resubstitution error
was measured by establishing the classification rules based on a data set and then
implementing the rule on the same data set to compute the efficiency. The obvious problem
with using only resubstitution error is that there is no way to gauge the sensitivity of the
established rules to variations in the data, thereby weakening claims of general applicability.
Cross validation, a frequently used method to address this concern, was accomplished either
by randomly targeting many subsets of the data against which to implement the rule, or by
forming the rule based on a large portion (75%) of the data and evaluating its performance
against the remaining 25%. Information on the sensitivity of the classification rules to
variations in the data was gathered by use of either of the cross validation approaches.
During variable selection, the resubstitution error was used in determining efficiency because
it allowed direct comparisons among models. Final results are reported in terms of cross
validation.


Other factors important in model derivation were the needs for
parsimony and the avoidance of collinearity. In terms of a regression model, the
explained variation or R² should be as high as possible consistent with the requirement that
the model be simple with a minimum of measures. This is equivalent to striving for low
values of Wilks' lambda in the discriminant analysis. The residuals were not to suggest a
model inadequacy, for example, suggesting that a quadratic expression in one of the
variables would have been more appropriate. Lastly, multicollinearity, a statistical
redundancy among variables in the model, was avoided to avert the danger that, although
prediction may seem to improve with correlated variables in the model for the data set
examined, the stability of the classifying rule for other data sets becomes suspect.

With these points in mind, Table 4 shows that models based on WOB, LIN,
AALH, MALH, and VC*AALH gave efficiencies >90%. WOB was the best performer,
the order being WOB > LIN > AALH > MALH > VC*AALH. The proportion of the
variation associated with class distinction that is explained by WOB is 0.838. The stepwise
discriminant rule established would misclassify 22 hyperactivated cells as being
non-hyperactivated and 19 non-hyperactivated cells as being hyperactivated for an overall
classification efficiency of 96.64%. At this stage of reporting, efficiencies are given in terms
of resubstitution misclassification error. Using the regression model, 31 hyperactivated cells
and 15 non-hyperactivated cells were misclassified for a classification efficiency of 96.23%.
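
The efficiencies quoted above follow directly from the misclassification counts and the class sizes (322 hyperactivated and 899 non-hyperactivated cells). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def efficiency_percent(h_missed, nh_missed, n_h=322, n_nh=899):
    """Percent of all cells correctly classified, given the counts missed in each class."""
    total = n_h + n_nh
    return 100.0 * (total - h_missed - nh_missed) / total


# WOB alone: stepwise discriminant rule, then regression model.
wob_discriminant = round(efficiency_percent(22, 19), 2)  # 96.64
wob_regression = round(efficiency_percent(31, 15), 2)    # 96.23
```

Because both kinds of error are weighted equally, only the total number of misclassified cells matters for this measure.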

Table 4.  Summary of Best Models Using Discriminant/Regression Analysis
          Based on 322 Hyperactivated and 899 Non-hyperactivated Cells

Paired entries are given as discriminant analysis / regression.

Model   Variables                     H(missed)   NH(missed)   Efficiency (%)   R²

D1      WOB                           22/31       19/15        96.64 / 96.23    0.838
D2      LIN                           21/34       63/46        93.12 / 93.45    0.702
D3      AALH                          59/91       5/4          94.76 / 92.22    0.639
D4      MALH                          63/109      16/9         93.53 / 90.34    0.600
D5      VC*AALH                       113/188     4/0          90.42 / 84.60    0.411
D6      STR                           132/171     78/36        82.80 / 83.05    0.356
D7      VC                            108/209     205/56       74.37 / 78.30    0.261
D8      VST                           56/210      301/26       70.76 / 80.67    0.235
D9      AALH/LIN                      190/283     1/0          84.36 / 76.82    0.140

D10     WOB, AALH                     21/32       16/10        96.97 / 96.56    0.856
D11     WOB, VC                       23/30       12/11        97.13 / 96.64    0.847
D12     WOB, MALH                     22/31       16/10        96.89 / 96.64    0.846
D13     WOB, VST                      22/30       16/12        96.89 / 96.56    0.843
D14     WOB, VC*AALH                  24/32       16/12        96.72 / 96.40    0.840

D15     WOB, AALH, VC*AALH            16/23       15/11        97.46 / 97.22    0.856
D16     WOB, AALH, STR                24/26       13/11        96.97 / 96.97    0.851
D17     WOB, AALH, AALH/LIN           20/30       16/10        97.05 / 96.72    0.851
D18     WOB, MALH, STR                19/27       15/13        97.22 / 96.72    0.850
D19     WOB, STR, VC                  24/29       12/10        97.05 / 96.81    0.850

D20     WOB, AALH, STR, VC*AALH       15/22       12/11        97.79 / 97.30    0.860
D21     WOB, AALH, VC*AALH, VC        16/23       11/10        97.79 / 97.30    0.860
D22     WOB, LIN, STR, VC             17/20       14/11        97.46 / 97.46    0.856
D23     WOB, LIN, AALH, STR           16/19       14/13        97.54 / 97.38    0.859
D24     WOB, AALH, VC*AALH, VST       16/23       12/11        97.71 / 97.22    0.858

D25     VC, LIN, AALH                 23/34       34/29        95.33 / 94.84    0.757
D26     VC, LIN, MALH                 24/38       40/30        94.76 / 94.43    0.746
D27     VC, LIN, VC*AALH              24/40       50/36        93.78 / 93.78    0.729

This table shows the number of cells misclassified by each of 27 BMDP-produced
models, listing the overall efficiency of classification for each.

All "best" models using two, three, or four motility parameters determined by either
discriminant or regression analysis contain WOB as one of the parameters. Further, though
not shown for each model, WOB was the largest contributor to explained variation for all
models. The gain in efficiency from adding further motility parameters to WOB must be
considered modest. Finally, the three-variable models (D25-D27) using the motility parameters
most popular in the literature show an efficiency less than that achieved by WOB alone.

The large number of models shown in Table 4, which reasonably could be
used to predict hyperactivated motility, was culled based on the models' sensitivity in
predicting hyperactivated motility for other sperm samples. This involved first looking at
cross validation results in misclassification. Although there was some variation in cross
validation rates relative to the resubstitution rates, there were no instances so different as to
suggest eliminating any of the possibilities on that basis alone. The question of parsimony is
largely a judgment as to how much is being contributed through adding more terms in the
model and at what risk of multicollinearity. Table 5 shows the correlation structure among the
motility parameters used in the models. For example, the correlation between VC*AALH and
AALH is 0.933. This means that AALH is capable of explaining 87% (0.933 squared) of
the variation of VC*AALH. The implication is that AALH and VC*AALH are too close
statistically to be used as predictors in the same model. Similarly, a 0.904 correlation exists
between WOB and LIN. They too were judged too close statistically. These results
effectively eliminate the "best" four-term models (D20-D24) as well as the best of the
three-variable models (D15) from consideration. Considering parsimony leaves only models
D10 and D11, if not just D1. Consider D10 and D11. They misclassify 37 and 35 cells,
respectively. The best of the remaining three-term models misclassifies 34 cells. The slight
increase in efficiency does not warrant the inclusion of a third term.
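
The redundancy argument above rests on squaring a correlation coefficient to obtain the share of variation one parameter explains in another. The sketch below shows the calculation; the function names and the small data set are hypothetical, and only the two correlations quoted in the text (0.933 and 0.904) come from this report.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shared_variation(r):
    """Proportion of variation in one variable explained by the other."""
    return r * r


# Correlations quoted in the text.
aalh_vs_product = shared_variation(0.933)  # about 0.87
wob_vs_lin = shared_variation(0.904)       # about 0.82

# Hypothetical cells: AALH and VC values, and their product VC*AALH.
aalh = [2.0, 4.0, 6.0, 8.0]
vc = [100.0, 120.0, 90.0, 130.0]
r_example = pearson_r(aalh, [v * a for v, a in zip(vc, aalh)])
```

A product term such as VC*AALH is built from AALH, so a high correlation between the two is expected and is the reason they were not used together in one model.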

In summary, WOB, WOB and AALH, or WOB and VC are the preferred
models on which to base classification rules. The motility parameters WOB and AALH are
more correlated than WOB and VC, and therefore run a greater risk of inflating the standard
error of prediction. Thus, the best choice would be the latter model based on WOB and VC.
The regression form of that model would be Predicted Class = -0.332250 - 0.000985 VC +
1.456690 WOB. If the predicted value for class was closer to zero than to one, the codes used
for hyperactivated and non-hyperactivated cells, respectively, the cell would be classified
hyperactivated; otherwise, non-hyperactivated. For example, if VC = 150 and WOB = 0.5,
then the class prediction is 0.25, indicating a hyperactivated cell. With 0.5 equidistant from
the class identifiers, we may equivalently express the constraint for hyperactivity as
WOB < 0.571330 + 0.000676 VC. The corresponding discriminant model would indicate
hyperactivity if WOB < 0.596416 + 0.000675 VC. There is little difference between the
approaches as long as the terms in the model have good predictive ability. Some difference
would be expected, for example, with a model based on VST and AALH/LIN. The
regression rule for the model WOB and AALH would be to classify a cell as hyperactivated
if WOB < 0.564818 + 0.018096 AALH. A regression classification rule based on WOB
alone would partition the cells at a WOB value of 0.646, with WOB being less than that
value indicating hyperactivity.
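
The regression rule for the WOB and VC model can be written out directly from the coefficients given above. The sketch below uses hypothetical function names; the coefficients and the worked example (VC = 150, WOB = 0.5) are those stated in the text.

```python
def predicted_class(vc, wob):
    """Regression score; 0 codes hyperactivated, 1 codes non-hyperactivated."""
    return -0.332250 - 0.000985 * vc + 1.456690 * wob


def is_hyperactivated(vc, wob):
    """Hyperactivated when the score is closer to zero than to one."""
    return predicted_class(vc, wob) < 0.5


def is_hyperactivated_boundary(vc, wob):
    """The same rule solved for WOB, using the rounded boundary given in the text."""
    return wob < 0.571330 + 0.000676 * vc


# Worked example from the text: the score is about 0.25.
example_score = predicted_class(150, 0.5)
example_call = is_hyperactivated(150, 0.5)
```

The two forms agree except for cells lying within rounding error of the boundary, since the second form uses coefficients rounded to six decimal places.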

Table 5.  Correlation Matrix of Motility Parameters


3.4  CART.


An  approach  distinct  from  the  regression  and  discriminant  analyses  above  is 
given  by  tree-structured  classification.  CART  was  the  principal  software  used;  FACT 
software  was  used  for  corroboration.  Only  the  CART  results  are  reported.  The  CART 
routine  offers  many  options;  only  the  defaults  were  used.  Generally,  for  univariate  splits, 
CART  works  as  follows.  Each  possible  predictor  variable  (motion  parameter)  for  class  is 
examined individually. For an individual variable, the program searches all the values,
stopping at each one to see how efficient it would be to partition the data into the
hyperactivated and non-hyperactivated classes based on that value. (In our data set, this
requires over 1200 assessments of efficiency for each variable.) The routine notes the best
value for that variable based on classification efficiency. The variable that partitions the
data in the most efficient manner is selected, and its best value is used as the first partition
of the data, creating two nodes, one each for hyperactivated and non-hyperactivated
classification. Within each node, some cells may be misclassified. The routine then searches
among the variables to further partition the two nodes to increase efficiency. Eventually, the
routine settles on a decision tree for classification with maximum efficiency, subject to the
constraint that tree complexity should not be great. A great advantage of tree-structured
methods is that we are no longer bound by a linear model as we were in the regression and
discriminant analyses, although linear combinations of variables can be considered. In
running CART, all the motility parameters previously considered as possible predictors were
included. The result was that CART chose only WOB and VC, with the rule: classify as
hyperactivated if WOB < 0.775 and VC > 50.5. Of the 1221 cases examined, only 12
non-hyperactivated cells and 2 hyperactivated cells were misclassified, for an efficiency of
98.85%. This efficiency is higher than that of any of the previously discussed models.
Despite the unusually low value for VC compared to the literature, this rule has great appeal
in considering the data in Figure 2. There, the incidence of non-hyperactivated cells with low
WOB and low VC is high enough to cast doubt on a model based on WOB alone. In this
use, VC is merely refining a classification rule based primarily on WOB.
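
The univariate search described above can be illustrated with a minimal sketch. It is not the CART program itself: it scores each candidate cutoff by the raw number of correct classifications, whereas CART proper uses an impurity criterion and prunes the resulting tree. The function name and the eight toy cells are assumptions made up for the illustration and are not the report's data.

```python
def best_univariate_split(rows, labels, names):
    """Exhaustively search every variable and cutoff for the best single split.

    rows   : list of tuples of motion-parameter values, one tuple per cell
    labels : list of bool, True = hyperactivated
    names  : variable names, in the same order as the tuple entries
    Returns (n_correct, variable, cutoff, direction), where direction "<"
    means "hyperactivated if value < cutoff" and ">" means the reverse.
    """
    n = len(labels)
    best = None
    for j, name in enumerate(names):
        distinct = sorted({row[j] for row in rows})
        # Candidate cutoffs: midpoints between consecutive distinct values.
        for low, high in zip(distinct, distinct[1:]):
            cutoff = (low + high) / 2.0
            below = sum((row[j] < cutoff) == lab for row, lab in zip(rows, labels))
            # No value equals a midpoint, so the ">" rule gets the complement.
            for direction, correct in (("<", below), (">", n - below)):
                if best is None or correct > best[0]:
                    best = (correct, name, cutoff, direction)
    return best


# Toy cells as (WOB, VC); purely illustrative values.
NAMES = ("WOB", "VC")
ROWS = [(0.40, 120), (0.55, 160), (0.62, 90), (0.70, 200),
        (0.80, 110), (0.88, 70), (0.92, 150), (0.95, 180)]
LABELS = [True, True, True, True, False, False, False, False]

print(best_univariate_split(ROWS, LABELS, NAMES))
```

Applying the same search again within each of the two resulting nodes gives the recursive partitioning that the text describes.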

The  use  of  LIN,  AALH,  and  VC  was  also  investigated.  CART  did  not  choose 
to  use  VC.  The  tree  was  slightly  more  complex,  having  five  nodes  instead  of  three  as 
above.  The  classification  efficiency  was  96.47%.  When  a  model  based  on  WOB  and  AALH 
was  attempted,  CART  did  not  choose  to  use  AALH,  opting  instead  for  a  rule  based  only  on 
WOB  for  an  efficiency  of  96.97%.  Other  runs  using  linear  combinations  of  variables  were 
attempted  but  resulted  in  more  complex  decision  trees. 

In  summary,  of  all  of  the  CART  models  examined,  one  of  the  simplest  to 
implement  was  also  the  best.  The  model  based  on  WOB  and  VC  performed  most  efficiently 
in  classifying  hyperactivated  and  non-hyperactivated  cells  with  the  least  penalty  in  model 
complexity. 

3.5  Comparison of Models.


Figure  3  illustrates  the  decision  criteria  delivered  by  the  discriminant  and 
CART  models  using  VC  and  WOB.  To  understand  the  model  differences,  we  have 
partitioned the point set WOB × VC, where WOB ranges from 0.0 to 1.0, and VC ranges
from 0 to 350, according to the hyperactivity decision rules for each model. A cell whose
WOB and VC values locate it in a shaded region would be classified as non-hyperactivated
by CART. The unshaded region corresponds to a hyperactivated classification delivered by
CART. The bold line represents the discriminant model. Points falling below that line
would be classified as hyperactivated, whereas those above the line would be classified as
non-hyperactivated. (The regression model is not shown but would appear nearly coincident
with the discriminant model.) Within each region, we have indicated the true number of
hyperactivated and non-hyperactivated cells present. From these data, one can see the
similarities and differences of the model rules and assess their relative performance.

First,  consider  the  rectangular  region  within  which  CART  would  classify  cells 
as  hyperactivated.  Below  the  discriminant  model  there  were  299  hyperactivated  cells 
correctly  classified  and  3  non-hyperactivated  cells  incorrectly  classified.  In  the  same  region 
but  above  the  discriminant  model,  there  were  21  hyperactivated  cells  correctly  classified  by 
CART, and 9 non-hyperactivated cells were incorrectly classified. Note that the discriminant
model would have incorrectly classified the 21 hyperactivated cells while correctly classifying
the 9 non-hyperactivated cells. CART is therefore 12 cells more accurate than the
discriminant model in this region. In the shaded regions (Figure 3) above the discriminant
model, the two models perform identically, incorrectly classifying 2 hyperactivated cells and
correctly classifying 878 non-hyperactivated cells. A difference is seen again for the shaded
region corresponding to low values of WOB and VC. There, the discriminant model would
incorrectly classify 9 non-hyperactivated cells, bringing the CART performance advantage to
21 cells. This figure also shows that using VC to establish a lower threshold is beneficial in
improving a classification by WOB alone. Twenty-three cells would have been incorrectly
classified using a WOB criterion without considering VC. Our inference is that WOB is a
stable measure and good classifier except for very slow-moving cells, which a WOB criterion
sometimes errantly classifies as hyperactivated.
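
The bookkeeping in the preceding paragraph can be checked with a short tally. The counts are those quoted in the text for Figure 3; the region labels and the table layout are assumptions introduced only for this sketch.

```python
# Each entry: (region, true class, count, CART prediction, discriminant prediction)
# H = hyperactivated, N = non-hyperactivated. Counts are as quoted in the text.
REGIONS = [
    ("CART rectangle, below line", "H", 299, "H", "H"),
    ("CART rectangle, below line", "N", 3, "H", "H"),
    ("CART rectangle, above line", "H", 21, "H", "N"),
    ("CART rectangle, above line", "N", 9, "H", "N"),
    ("shaded, above line", "H", 2, "N", "N"),
    ("shaded, above line", "N", 878, "N", "N"),
    ("shaded, low WOB and low VC", "N", 9, "N", "H"),
]

total = sum(count for _, _, count, _, _ in REGIONS)
cart_errors = sum(c for _, truth, c, cart, _ in REGIONS if cart != truth)
disc_errors = sum(c for _, truth, c, _, disc in REGIONS if disc != truth)

print("cells tallied:          ", total)
print("CART misclassified:     ", cart_errors)
print("discriminant misclass.: ", disc_errors)
print("CART advantage (cells): ", disc_errors - cart_errors)
```

The regions account for all 1221 cells, and the difference in errors matches the 21-cell advantage stated above.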

3.6  Sensitivity  Analysis. 

Earlier,  we  stressed  the  undesirability  of  forming  a  model  based  on  a  data  set 
and  evaluating  the  efficiency  of  the  model  based  on  that  same  data  set.  Thus  far,  to  facilitate 
model comparison, the resubstitution error has been used for computing efficiency.
However, it should be noted that several different methods of cross validation, including
jackknifed  estimates,  random  subsets,  and  the  test  sample  method  were  also  employed.  In 
general,  we  found  that  the  best  linear  models  and  CART  were  resistant  to  changes  in  the  data 
from  cross  validation  efforts.  The  efficiency  according  to  cross  validation  among  the  various 
methods  was,  at  worst,  98%. 
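
The difference between a resubstitution estimate and a cross-validated one can be sketched as follows. This is a k-fold illustration, a relative of the jackknife, random-subset, and test-sample methods named above rather than a reproduction of them. It uses a single WOB cutoff instead of a full tree, and the data are synthetic stand-ins generated in the code, not the report's measurements. All function names are hypothetical.

```python
import random


def fit_cutoff(sample):
    """Choose the WOB cutoff that classifies the most cells in `sample` correctly.

    sample: list of (wob, is_hyperactivated); rule: hyperactivated if wob < cutoff.
    """
    values = sorted({wob for wob, _ in sample})
    candidates = [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda t: sum((w < t) == h for w, h in sample))


def efficiency(sample, cutoff):
    """Fraction of cells in `sample` classified correctly by the cutoff."""
    return sum((w < cutoff) == h for w, h in sample) / len(sample)


def k_fold_efficiency(data, k=10, seed=0):
    """Fit on k-1 folds, score on the held-out fold, and pool the results."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    correct = 0
    for i, held_out in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        cutoff = fit_cutoff(train)
        correct += sum((w < cutoff) == h for w, h in held_out)
    return correct / len(data)


# Synthetic stand-in data: NOT the report's measurements.
rng = random.Random(1)
data = ([(rng.gauss(0.45, 0.12), True) for _ in range(100)]
        + [(rng.gauss(0.85, 0.10), False) for _ in range(300)])

resub = efficiency(data, fit_cutoff(data))  # fitted and scored on the same cells
cv = k_fold_efficiency(data)                # scored only on held-out cells
print(f"resubstitution: {resub:.3f}  cross-validated: {cv:.3f}")
```

A rule is resistant in the sense used above when the two figures stay close to each other.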

Figure 3. A comparison of discriminant and CART models to classify hyperactivation.
The figure shows the model rules for classifying cells and indicates the correctness
of those classifications. The white region is the hyperactivated region for CART. The
hyperactivated region for the discriminant model is the area below the bold line.

3.7  Motion Analysis System (CellTrak).


The  previous  analysis  was  based  on  the  motion  parameter  values  as  measured 
by  the  CellSoft  system.  To  determine  how  the  resulting  model  performs  when  applied  to 
parameter  values  measured  by  another  system,  the  tapes  were  reanalyzed  using  the  Motion 
Analysis  System.  Although  the  tapes  were  the  same,  it  was  impossible  to  determine  if 
exactly  the  same  cells  were  being  analyzed.  The  number  of  cells  analyzed  was  1119, 
somewhat  less  than  the  1221  analyzed  by  CellSoft.  Table  6  summarizes  the  individual 
parameters for the Motion Analysis System. This investigation shows that some motility
parameters differ from those reported by CellSoft. The largest difference occurs with
AALH. CellTrak seems to approximately double the AALH values relative to CellSoft.
Another difference noted is with VST values, particularly among the non-hyperactivated
cells. The VST as measured by CellTrak is approximately 12 μm/s slower for
non-hyperactivated cells than that measured by CellSoft. Other differences include the means
for LIN and WOB for the non-hyperactivated cells and, predictably, the means for AALH/LIN
and VC*AALH for all cells. Still, scatter plot examination (not shown) reveals a data
structure between parameters similar to that observed with CellSoft. Applying the CART
decision rule based on the CellSoft data to the data produced by CellTrak yielded
surprisingly good results. Forty cells were misclassified, for an efficiency of 96.4%. In an
effort to calibrate the model for the system being used, CART was performed on the
CellTrak data to determine a model best suited for classifying these new data. CART again
picked WOB and VC together, with the same tree structure, to predict hyperactivity. The rule,
only slightly different from that for CellSoft, would be to classify as hyperactivated those
cells showing WOB < 0.705 and VC > 49.2. The number of cells misclassified was 30, for an
efficiency of 97.3%. Cross validation results reported efficiencies, at worst, of 97%.
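
The system-specific rules and the efficiencies quoted above can be expressed compactly. The cutoffs and misclassification counts are taken from the text; the dictionary layout, the function names, and the example cell are assumptions made for illustration.

```python
# (WOB upper bound, VC lower bound in um/s) for each system, as quoted in the text.
RULES = {
    "CellSoft": (0.775, 50.5),
    "CellTrak": (0.705, 49.2),
}


def is_hyperactivated(wob, vc, system):
    """Apply the two-node CART rule fitted for the given analysis system."""
    wob_max, vc_min = RULES[system]
    return wob < wob_max and vc > vc_min


def efficiency(n_misclassified, n_cells):
    """Classification efficiency in percent."""
    return 100.0 * (1.0 - n_misclassified / n_cells)


# CellSoft rule applied to the 1119 CellTrak cells (40 misclassified).
print(round(efficiency(40, 1119), 1))
# Rule refitted on the CellTrak data (30 misclassified).
print(round(efficiency(30, 1119), 1))

# A hypothetical cell with WOB = 0.74 and VC = 120 falls between the two WOB
# cutoffs, so the two rules disagree on it.
print(is_hyperactivated(0.74, 120, "CellSoft"), is_hyperactivated(0.74, 120, "CellTrak"))
```

Cells with WOB between 0.705 and 0.775 are the ones on which the choice of system-specific rule matters.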

3.8  Data  Cleansing. 

A  further  check  on  the  model  validity  involved  reexamining  each  of  the  cell 
tracks  analyzed  by  CellSoft  and  CellTrak.  Sperm  cells  that  were  in  the  gray  area  for 
hyperactivated  motility  were  removed  from  the  data  sets,  and  the  CART  routine  was  repeated 
for both the CellSoft and CellTrak results. With CellSoft, 13 cells were removed, and no
change at all was recorded for the decision rule values of 0.775 for WOB and 50.5 μm/s
for VC. For CellTrak, 40 cells were removed with a slight change in values. The WOB
partition changed from 0.705 to 0.685, and the VC partition changed from 49.2 μm/s to
54.9 μm/s. The new efficiencies for CellSoft and CellTrak were 98.7% and 98.4%,
respectively, computed as a cross validation efficiency.


4.  CONCLUSIONS 

Overall, the CART model based on VC and WOB is preferred. It certainly
performs better than the discriminant or regression models based on the same motility
parameters, and much better than linear discriminant, linear regression, or CART models
based on the motility parameters most commonly used in the literature.

The classification rule for analysis with the CellSoft system is WOB < 0.775
and VC > 51 μm/s. For the CellTrak system, the rule is WOB < 0.705 and VC >
50 μm/s, or for a more restricted classification WOB < 0.685 and VC > 55 μm/s.

Table 6. Summary Statistics for the Motility Parameters Determined by
CellTrak for both Hyperactivated and Non-hyperactivated Cells

Motility                                     Sample
Parameter   Class       Mean ± SD           Minimum    Maximum      Range

VC          hyper       136.9 ± 51.1           50.4      383.5      333.2
            non-hyper   77.9 ± 37.0            25.0      186.0      161.0
VST         hyper       28.5 ± 18.3             0.0      109.3      109.3
            non-hyper   57.9 ± 38.8             0.0      174.0      174.0
LIN         hyper       0.24 ± 0.17            0.01       0.82       0.81
            non-hyper   0.73 ± 0.26            0.01       0.99       0.98
AALH        hyper       14.3 ± 6.5              1.5       63.0       61.5
            non-hyper   3.9 ± 2.1               1.3       13.0       11.7
STR         hyper       0.53 ± 0.27             0.0       0.99       0.99
            non-hyper   0.84 ± 0.22             0.0       0.99       0.99
WOB         hyper       0.42 ± 0.15            0.14       0.90       0.76
            non-hyper   0.84 ± 0.21            0.12       0.99       0.77
AALH/LIN    hyper       122.1 ± 156.7           4.4     1575.8     1571.4
            non-hyper   11.3 ± 31.9             1.4      390.0      388.6
VC*AALH     hyper       2153.2 ± 1708.3       223.4    12434.4    12211.0
            non-hyper   327.5 ± 290.1          38.0     2275.0     2237.0

Summary statistics for each motility parameter are reported individually
for each cell state. The mean ± the sample standard deviation,
for the population, gives information as to the location of the
majority of motility parameter values. The minimum, maximum,
and range provide information as to the extremes. All summary
statistics were computed using BMDP statistical software.

APPENDIX

MOTILITY PARAMETER DISTRIBUTIONS (BY HYPERACTIVITY)